Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 5693 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 622.8 KiB |
| Average record size in memory | 112.0 B |
Variable types
| Numeric | 14 |
|---|---|
revenue is highly correlated with quantity_orders and 3 other fields | High correlation |
recency is highly correlated with quantity_orders and 2 other fields | High correlation |
quantity_orders is highly correlated with revenue and 7 other fields | High correlation |
quantity_items_purchased is highly correlated with revenue and 4 other fields | High correlation |
avg_ticket is highly correlated with revenue and 3 other fields | High correlation |
avg_recency is highly correlated with recency and 2 other fields | High correlation |
frequency is highly correlated with recency and 2 other fields | High correlation |
frequency_btwn_purchases is highly correlated with quantity_orders and 1 other field | High correlation |
avg_basket_size is highly correlated with revenue and 3 other fields | High correlation |
avg_unique_basked_size is highly correlated with avg_ticket and 1 other field | High correlation |
quantity_items_returned is highly correlated with quantity_orders and 1 other field | High correlation |
monetary_returned is highly correlated with quantity_orders and 1 other field | High correlation |
revenue is highly correlated with quantity_orders and 1 other field | High correlation |
recency is highly correlated with avg_recency | High correlation |
quantity_orders is highly correlated with revenue and 1 other field | High correlation |
quantity_items_purchased is highly correlated with revenue and 1 other field | High correlation |
avg_ticket is highly correlated with avg_basket_size | High correlation |
avg_recency is highly correlated with recency | High correlation |
avg_basket_size is highly correlated with avg_ticket | High correlation |
quantity_items_returned is highly correlated with monetary_returned | High correlation |
monetary_returned is highly correlated with quantity_items_returned | High correlation |
revenue is highly correlated with quantity_orders and 3 other fields | High correlation |
recency is highly correlated with avg_recency and 1 other field | High correlation |
quantity_orders is highly correlated with revenue and 2 other fields | High correlation |
quantity_items_purchased is highly correlated with revenue and 2 other fields | High correlation |
avg_ticket is highly correlated with revenue and 1 other field | High correlation |
avg_recency is highly correlated with recency and 1 other field | High correlation |
frequency is highly correlated with recency and 1 other field | High correlation |
frequency_btwn_purchases is highly correlated with quantity_orders | High correlation |
avg_basket_size is highly correlated with revenue and 2 other fields | High correlation |
quantity_items_returned is highly correlated with monetary_returned | High correlation |
monetary_returned is highly correlated with quantity_items_returned | High correlation |
customer_id is highly correlated with recency and 2 other fields | High correlation |
revenue is highly correlated with quantity_orders and 4 other fields | High correlation |
recency is highly correlated with customer_id and 2 other fields | High correlation |
quantity_orders is highly correlated with revenue and 3 other fields | High correlation |
quantity_items_purchased is highly correlated with revenue and 4 other fields | High correlation |
avg_ticket is highly correlated with avg_basket_size and 1 other field | High correlation |
avg_recency is highly correlated with customer_id and 2 other fields | High correlation |
time_in_base is highly correlated with customer_id and 2 other fields | High correlation |
frequency is highly correlated with revenue and 3 other fields | High correlation |
avg_basket_size is highly correlated with avg_ticket and 2 other fields | High correlation |
avg_unique_basked_size is highly correlated with avg_ticket and 1 other field | High correlation |
quantity_items_returned is highly correlated with revenue and 5 other fields | High correlation |
monetary_returned is highly correlated with revenue and 2 other fields | High correlation |
revenue is highly skewed (γ1 = 23.01115224) | Skewed |
quantity_items_purchased is highly skewed (γ1 = 25.09928996) | Skewed |
avg_ticket is highly skewed (γ1 = 20.84844077) | Skewed |
quantity_items_returned is highly skewed (γ1 = 29.45080834) | Skewed |
monetary_returned is highly skewed (γ1 = 35.43664486) | Skewed |
customer_id has unique values | Unique |
quantity_items_returned has 4190 (73.6%) zeros | Zeros |
monetary_returned has 4190 (73.6%) zeros | Zeros |
Reproduction
| Analysis started | 2022-05-04 19:22:21.346532 |
|---|---|
| Analysis finished | 2022-05-04 19:23:27.797284 |
| Duration | 1 minute and 6.45 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
customer_id
Real number (ℝ≥0)
| Distinct | 5693 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 16601.48287 |
| Minimum | 12347 |
|---|---|
| Maximum | 22709 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 12347 |
|---|---|
| 5-th percentile | 12700.6 |
| Q1 | 14289 |
| median | 16229 |
| Q3 | 18211 |
| 95-th percentile | 21732.8 |
| Maximum | 22709 |
| Range | 10362 |
| Interquartile range (IQR) | 3922 |
Descriptive statistics
| Standard deviation | 2808.14998 |
|---|---|
| Coefficient of variation (CV) | 0.1691505513 |
| Kurtosis | -0.8215480997 |
| Mean | 16601.48287 |
| Median Absolute Deviation (MAD) | 1962 |
| Skewness | 0.4411393053 |
| Sum | 94512242 |
| Variance | 7885706.31 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 17850 | 1 | < 0.1% |
| 16123 | 1 | < 0.1% |
| 15335 | 1 | < 0.1% |
| 17534 | 1 | < 0.1% |
| 17205 | 1 | < 0.1% |
| 16412 | 1 | < 0.1% |
| 13923 | 1 | < 0.1% |
| 17520 | 1 | < 0.1% |
| 17201 | 1 | < 0.1% |
| 16563 | 1 | < 0.1% |
| Other values (5683) | 5683 | 99.8% |
| Value | Count | Frequency (%) |
| 12347 | 1 | < 0.1% |
| 12348 | 1 | < 0.1% |
| 12349 | 1 | < 0.1% |
| 12350 | 1 | < 0.1% |
| 12352 | 1 | < 0.1% |
| 12353 | 1 | < 0.1% |
| 12354 | 1 | < 0.1% |
| 12355 | 1 | < 0.1% |
| 12356 | 1 | < 0.1% |
| 12357 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 22709 | 1 | < 0.1% |
| 22708 | 1 | < 0.1% |
| 22707 | 1 | < 0.1% |
| 22706 | 1 | < 0.1% |
| 22705 | 1 | < 0.1% |
| 22704 | 1 | < 0.1% |
| 22700 | 1 | < 0.1% |
| 22699 | 1 | < 0.1% |
| 22696 | 1 | < 0.1% |
| 22695 | 1 | < 0.1% |
revenue
Real number (ℝ≥0)
| Distinct | 5447 |
|---|---|
| Distinct (%) | 95.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1761.340198 |
| Minimum | 0.42 |
|---|---|
| Maximum | 279138.02 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0.42 |
|---|---|
| 5-th percentile | 13.128 |
| Q1 | 236.18 |
| median | 613.2 |
| Q3 | 1570.45 |
| 95-th percentile | 5302.148 |
| Maximum | 279138.02 |
| Range | 279137.6 |
| Interquartile range (IQR) | 1334.27 |
Descriptive statistics
| Standard deviation | 7517.330182 |
|---|---|
| Coefficient of variation (CV) | 4.26796038 |
| Kurtosis | 697.9733466 |
| Mean | 1761.340198 |
| Median Absolute Deviation (MAD) | 479.16 |
| Skewness | 23.01115224 |
| Sum | 10027309.75 |
| Variance | 56510253.07 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 7.95 | 9 | 0.2% |
| 1.25 | 8 | 0.1% |
| 4.95 | 8 | 0.1% |
| 2.95 | 8 | 0.1% |
| 12.75 | 7 | 0.1% |
| 1.65 | 7 | 0.1% |
| 3.75 | 7 | 0.1% |
| 5.95 | 6 | 0.1% |
| 7.5 | 6 | 0.1% |
| 4.25 | 6 | 0.1% |
| Other values (5437) | 5621 | 98.7% |
| Value | Count | Frequency (%) |
| 0.42 | 1 | < 0.1% |
| 0.65 | 1 | < 0.1% |
| 0.79 | 1 | < 0.1% |
| 0.84 | 4 | 0.1% |
| 0.85 | 3 | 0.1% |
| 1.07 | 1 | < 0.1% |
| 1.25 | 8 | 0.1% |
| 1.44 | 1 | < 0.1% |
| 1.65 | 7 | 0.1% |
| 1.69 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 279138.02 | 1 | < 0.1% |
| 259657.3 | 1 | < 0.1% |
| 194550.79 | 1 | < 0.1% |
| 140450.72 | 1 | < 0.1% |
| 124564.53 | 1 | < 0.1% |
| 117379.63 | 1 | < 0.1% |
| 91062.38 | 1 | < 0.1% |
| 72882.09 | 1 | < 0.1% |
| 66653.56 | 1 | < 0.1% |
| 65039.62 | 1 | < 0.1% |
recency
Real number (ℝ≥0)
| Distinct | 304 |
|---|---|
| Distinct (%) | 5.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 116.8909187 |
| Minimum | 0 |
|---|---|
| Maximum | 373 |
| Zeros | 37 |
| Zeros (%) | 0.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 23 |
| median | 71 |
| Q3 | 200 |
| 95-th percentile | 338 |
| Maximum | 373 |
| Range | 373 |
| Interquartile range (IQR) | 177 |
Descriptive statistics
| Standard deviation | 111.6046783 |
|---|---|
| Coefficient of variation (CV) | 0.9547762958 |
| Kurtosis | -0.6424840601 |
| Mean | 116.8909187 |
| Median Absolute Deviation (MAD) | 61 |
| Skewness | 0.8143393856 |
| Sum | 665460 |
| Variance | 12455.60423 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 110 | 1.9% |
| 4 | 105 | 1.8% |
| 3 | 98 | 1.7% |
| 2 | 91 | 1.6% |
| 10 | 86 | 1.5% |
| 8 | 82 | 1.4% |
| 17 | 79 | 1.4% |
| 9 | 79 | 1.4% |
| 7 | 78 | 1.4% |
| 15 | 67 | 1.2% |
| Other values (294) | 4818 | 84.6% |
| Value | Count | Frequency (%) |
| 0 | 37 | 0.6% |
| 1 | 110 | 1.9% |
| 2 | 91 | 1.6% |
| 3 | 98 | 1.7% |
| 4 | 105 | 1.8% |
| 5 | 52 | 0.9% |
| 7 | 78 | 1.4% |
| 8 | 82 | 1.4% |
| 9 | 79 | 1.4% |
| 10 | 86 | 1.5% |
| Value | Count | Frequency (%) |
| 373 | 23 | 0.4% |
| 372 | 22 | 0.4% |
| 371 | 17 | 0.3% |
| 369 | 4 | 0.1% |
| 368 | 13 | 0.2% |
| 367 | 16 | 0.3% |
| 366 | 15 | 0.3% |
| 365 | 19 | 0.3% |
| 364 | 11 | 0.2% |
| 362 | 7 | 0.1% |
quantity_orders
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 56 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.470402248 |
| Minimum | 1 |
|---|---|
| Maximum | 206 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 4 |
| 95-th percentile | 11 |
| Maximum | 206 |
| Range | 205 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 6.81053531 |
|---|---|
| Coefficient of variation (CV) | 1.962462799 |
| Kurtosis | 302.473391 |
| Mean | 3.470402248 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 13.19909941 |
| Sum | 19757 |
| Variance | 46.38339121 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 2870 | 50.4% |
| 2 | 826 | 14.5% |
| 3 | 501 | 8.8% |
| 4 | 395 | 6.9% |
| 5 | 236 | 4.1% |
| 6 | 173 | 3.0% |
| 7 | 139 | 2.4% |
| 8 | 98 | 1.7% |
| 9 | 68 | 1.2% |
| 10 | 55 | 1.0% |
| Other values (46) | 332 | 5.8% |
| Value | Count | Frequency (%) |
| 1 | 2870 | 50.4% |
| 2 | 826 | 14.5% |
| 3 | 501 | 8.8% |
| 4 | 395 | 6.9% |
| 5 | 236 | 4.1% |
| 6 | 173 | 3.0% |
| 7 | 139 | 2.4% |
| 8 | 98 | 1.7% |
| 9 | 68 | 1.2% |
| 10 | 55 | 1.0% |
| Value | Count | Frequency (%) |
| 206 | 1 | < 0.1% |
| 199 | 1 | < 0.1% |
| 124 | 1 | < 0.1% |
| 97 | 1 | < 0.1% |
| 91 | 1 | < 0.1% |
| 90 | 1 | < 0.1% |
| 86 | 1 | < 0.1% |
| 72 | 1 | < 0.1% |
| 62 | 2 | < 0.1% |
| 60 | 1 | < 0.1% |
quantity_items_purchased
Real number (ℝ≥0)
HIGH CORRELATION, SKEWED
| Distinct | 1840 |
|---|---|
| Distinct (%) | 32.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 951.7265062 |
| Minimum | 1 |
|---|---|
| Maximum | 196844 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 106 |
| median | 317 |
| Q3 | 804 |
| 95-th percentile | 2925.2 |
| Maximum | 196844 |
| Range | 196843 |
| Interquartile range (IQR) | 698 |
Descriptive statistics
| Standard deviation | 4189.903881 |
|---|---|
| Coefficient of variation (CV) | 4.402424282 |
| Kurtosis | 942.5150692 |
| Mean | 951.7265062 |
| Median Absolute Deviation (MAD) | 253 |
| Skewness | 25.09928996 |
| Sum | 5418179 |
| Variance | 17555294.53 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 113 | 2.0% |
| 2 | 73 | 1.3% |
| 3 | 51 | 0.9% |
| 4 | 49 | 0.9% |
| 5 | 35 | 0.6% |
| 6 | 29 | 0.5% |
| 12 | 25 | 0.4% |
| 88 | 22 | 0.4% |
| 72 | 21 | 0.4% |
| 7 | 20 | 0.4% |
| Other values (1830) | 5255 | 92.3% |
| Value | Count | Frequency (%) |
| 1 | 113 | 2.0% |
| 2 | 73 | 1.3% |
| 3 | 51 | 0.9% |
| 4 | 49 | 0.9% |
| 5 | 35 | 0.6% |
| 6 | 29 | 0.5% |
| 7 | 20 | 0.4% |
| 8 | 18 | 0.3% |
| 9 | 7 | 0.1% |
| 10 | 17 | 0.3% |
| Value | Count | Frequency (%) |
| 196844 | 1 | < 0.1% |
| 80263 | 1 | < 0.1% |
| 77373 | 1 | < 0.1% |
| 69993 | 1 | < 0.1% |
| 64549 | 1 | < 0.1% |
| 64124 | 1 | < 0.1% |
| 63312 | 1 | < 0.1% |
| 58343 | 1 | < 0.1% |
| 57885 | 1 | < 0.1% |
| 50255 | 1 | < 0.1% |
avg_ticket
Real number (ℝ≥0)
HIGH CORRELATION, SKEWED
| Distinct | 5452 |
|---|---|
| Distinct (%) | 95.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 554.0214815 |
| Minimum | 0.42 |
|---|---|
| Maximum | 52940.94 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0.42 |
|---|---|
| 5-th percentile | 12.83 |
| Q1 | 158.95 |
| median | 297.38 |
| Q3 | 486.6742857 |
| 95-th percentile | 1840.048 |
| Maximum | 52940.94 |
| Range | 52940.52 |
| Interquartile range (IQR) | 327.7242857 |
Descriptive statistics
| Standard deviation | 1380.286726 |
|---|---|
| Coefficient of variation (CV) | 2.49139568 |
| Kurtosis | 694.0198078 |
| Mean | 554.0214815 |
| Median Absolute Deviation (MAD) | 152.3 |
| Skewness | 20.84844077 |
| Sum | 3154044.294 |
| Variance | 1905191.445 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 7.95 | 9 | 0.2% |
| 1.25 | 8 | 0.1% |
| 2.95 | 8 | 0.1% |
| 4.95 | 8 | 0.1% |
| 12.75 | 7 | 0.1% |
| 3.75 | 7 | 0.1% |
| 1.65 | 7 | 0.1% |
| 5.95 | 6 | 0.1% |
| 7.5 | 6 | 0.1% |
| 4.25 | 6 | 0.1% |
| Other values (5442) | 5621 | 98.7% |
| Value | Count | Frequency (%) |
| 0.42 | 1 | < 0.1% |
| 0.65 | 1 | < 0.1% |
| 0.79 | 1 | < 0.1% |
| 0.84 | 4 | 0.1% |
| 0.85 | 3 | 0.1% |
| 1.07 | 1 | < 0.1% |
| 1.25 | 8 | 0.1% |
| 1.44 | 1 | < 0.1% |
| 1.65 | 7 | 0.1% |
| 1.69 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 52940.94 | 1 | < 0.1% |
| 50653.91 | 1 | < 0.1% |
| 21389.6 | 1 | < 0.1% |
| 18745.86 | 1 | < 0.1% |
| 14855.53 | 1 | < 0.1% |
| 14844.76667 | 1 | < 0.1% |
| 13305.5 | 1 | < 0.1% |
| 12681.58 | 1 | < 0.1% |
| 12633.67 | 1 | < 0.1% |
| 12172.09 | 1 | < 0.1% |
avg_recency
Real number (ℝ≥0)
| Distinct | 1181 |
|---|---|
| Distinct (%) | 20.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 123.9936202 |
| Minimum | 0 |
|---|---|
| Maximum | 373 |
| Zeros | 4 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 15 |
| Q1 | 44.125 |
| median | 86 |
| Q3 | 184 |
| 95-th percentile | 336.4 |
| Maximum | 373 |
| Range | 373 |
| Interquartile range (IQR) | 139.875 |
Descriptive statistics
| Standard deviation | 101.7956195 |
|---|---|
| Coefficient of variation (CV) | 0.8209746548 |
| Kurtosis | -0.2540875244 |
| Mean | 123.9936202 |
| Median Absolute Deviation (MAD) | 55.33333333 |
| Skewness | 0.9376025916 |
| Sum | 705895.6796 |
| Variance | 10362.34815 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 60 | 32 | 0.6% |
| 53 | 31 | 0.5% |
| 213 | 30 | 0.5% |
| 353 | 30 | 0.5% |
| 184 | 29 | 0.5% |
| 46 | 28 | 0.5% |
| 64 | 27 | 0.5% |
| 28 | 27 | 0.5% |
| 77 | 26 | 0.5% |
| 154 | 25 | 0.4% |
| Other values (1171) | 5408 | 95.0% |
| Value | Count | Frequency (%) |
| 0 | 4 | 0.1% |
| 1 | 11 | 0.2% |
| 2 | 7 | 0.1% |
| 2.847328244 | 1 | < 0.1% |
| 3 | 13 | 0.2% |
| 3.300884956 | 1 | < 0.1% |
| 3.330357143 | 1 | < 0.1% |
| 3.333333333 | 1 | < 0.1% |
| 4 | 18 | 0.3% |
| 4.144444444 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 373 | 23 | 0.4% |
| 372 | 21 | 0.4% |
| 371 | 17 | 0.3% |
| 369 | 4 | 0.1% |
| 368 | 13 | 0.2% |
| 367 | 16 | 0.3% |
| 366 | 14 | 0.2% |
| 365 | 19 | 0.3% |
| 364 | 11 | 0.2% |
| 362 | 7 | 0.1% |
time_in_base
Real number (ℝ≥0)
| Distinct | 305 |
|---|---|
| Distinct (%) | 5.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 217.2320393 |
| Minimum | 1 |
|---|---|
| Maximum | 374 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 25 |
| Q1 | 110 |
| median | 239 |
| Q3 | 319 |
| 95-th percentile | 370 |
| Maximum | 374 |
| Range | 373 |
| Interquartile range (IQR) | 209 |
Descriptive statistics
| Standard deviation | 116.5955573 |
|---|---|
| Coefficient of variation (CV) | 0.5367327841 |
| Kurtosis | -1.23398492 |
| Mean | 217.2320393 |
| Median Absolute Deviation (MAD) | 96 |
| Skewness | -0.2945052057 |
| Sum | 1236702 |
| Variance | 13594.52398 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 374 | 101 | 1.8% |
| 373 | 97 | 1.7% |
| 367 | 88 | 1.5% |
| 369 | 78 | 1.4% |
| 366 | 76 | 1.3% |
| 370 | 70 | 1.2% |
| 359 | 66 | 1.2% |
| 368 | 61 | 1.1% |
| 372 | 57 | 1.0% |
| 360 | 46 | 0.8% |
| Other values (295) | 4953 | 87.0% |
| Value | Count | Frequency (%) |
| 1 | 4 | 0.1% |
| 2 | 11 | 0.2% |
| 3 | 7 | 0.1% |
| 4 | 13 | 0.2% |
| 5 | 18 | 0.3% |
| 6 | 9 | 0.2% |
| 8 | 13 | 0.2% |
| 9 | 6 | 0.1% |
| 10 | 14 | 0.2% |
| 11 | 21 | 0.4% |
| Value | Count | Frequency (%) |
| 374 | 101 | 1.8% |
| 373 | 97 | 1.7% |
| 372 | 57 | 1.0% |
| 370 | 70 | 1.2% |
| 369 | 78 | 1.4% |
| 368 | 61 | 1.1% |
| 367 | 88 | 1.5% |
| 366 | 76 | 1.3% |
| 365 | 45 | 0.8% |
| 363 | 32 | 0.6% |
frequency
Real number (ℝ≥0)
| Distinct | 1222 |
|---|---|
| Distinct (%) | 21.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.02307007074 |
| Minimum | 0.002673796791 |
|---|---|
| Maximum | 1 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0.002673796791 |
|---|---|
| 5-th percentile | 0.00296735905 |
| Q1 | 0.005449591281 |
| median | 0.01201201201 |
| Q3 | 0.02388059701 |
| 95-th percentile | 0.0688233203 |
| Maximum | 1 |
| Range | 0.9973262032 |
| Interquartile range (IQR) | 0.01843100573 |
Descriptive statistics
| Standard deviation | 0.04829054538 |
|---|---|
| Coefficient of variation (CV) | 2.093211847 |
| Kurtosis | 173.2740082 |
| Mean | 0.02307007074 |
| Median Absolute Deviation (MAD) | 0.00750018311 |
| Skewness | 10.79591569 |
| Sum | 131.3379127 |
| Variance | 0.002331976773 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.01851851852 | 37 | 0.6% |
| 0.005405405405 | 32 | 0.6% |
| 0.01639344262 | 31 | 0.5% |
| 0.002824858757 | 30 | 0.5% |
| 0.004672897196 | 30 | 0.5% |
| 0.01538461538 | 29 | 0.5% |
| 0.05263157895 | 29 | 0.5% |
| 0.01923076923 | 28 | 0.5% |
| 0.025 | 27 | 0.5% |
| 0.04545454545 | 26 | 0.5% |
| Other values (1212) | 5394 | 94.7% |
| Value | Count | Frequency (%) |
| 0.002673796791 | 22 | 0.4% |
| 0.002680965147 | 21 | 0.4% |
| 0.002688172043 | 17 | 0.3% |
| 0.002702702703 | 3 | 0.1% |
| 0.0027100271 | 13 | 0.2% |
| 0.002717391304 | 16 | 0.3% |
| 0.00272479564 | 14 | 0.2% |
| 0.002732240437 | 19 | 0.3% |
| 0.002739726027 | 11 | 0.2% |
| 0.002754820937 | 7 | 0.1% |
| Value | Count | Frequency (%) |
| 1 | 5 | 0.1% |
| 0.550802139 | 1 | < 0.1% |
| 0.5320855615 | 1 | < 0.1% |
| 0.5 | 11 | 0.2% |
| 0.4 | 1 | < 0.1% |
| 0.3333333333 | 6 | 0.1% |
| 0.3315508021 | 1 | < 0.1% |
| 0.3157894737 | 1 | < 0.1% |
| 0.2727272727 | 2 | < 0.1% |
| 0.2621621622 | 1 | < 0.1% |
frequency_btwn_purchases
Real number (ℝ≥0)
| Distinct | 1225 |
|---|---|
| Distinct (%) | 21.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.5475856325 |
| Minimum | 0.005449591281 |
|---|---|
| Maximum | 17 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0.005449591281 |
|---|---|
| 5-th percentile | 0.01104159896 |
| Q1 | 0.02492211838 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 1 |
| Maximum | 17 |
| Range | 16.99455041 |
| Interquartile range (IQR) | 0.9750778816 |
Descriptive statistics
| Standard deviation | 0.550614711 |
|---|---|
| Coefficient of variation (CV) | 1.005531698 |
| Kurtosis | 138.8163381 |
| Mean | 0.5475856325 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.852587408 |
| Sum | 3117.405006 |
| Variance | 0.3031765599 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 2878 | 50.6% |
| 2 | 48 | 0.8% |
| 0.0625 | 18 | 0.3% |
| 0.02777777778 | 17 | 0.3% |
| 0.02380952381 | 16 | 0.3% |
| 0.09090909091 | 15 | 0.3% |
| 0.08333333333 | 15 | 0.3% |
| 0.03448275862 | 14 | 0.2% |
| 0.02941176471 | 14 | 0.2% |
| 0.01923076923 | 13 | 0.2% |
| Other values (1215) | 2645 | 46.5% |
| Value | Count | Frequency (%) |
| 0.005449591281 | 1 | < 0.1% |
| 0.005464480874 | 1 | < 0.1% |
| 0.005479452055 | 1 | < 0.1% |
| 0.005494505495 | 1 | < 0.1% |
| 0.005586592179 | 2 | < 0.1% |
| 0.005602240896 | 1 | < 0.1% |
| 0.005617977528 | 2 | < 0.1% |
| 0.00566572238 | 1 | < 0.1% |
| 0.005681818182 | 2 | < 0.1% |
| 0.005698005698 | 3 | 0.1% |
| Value | Count | Frequency (%) |
| 17 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 3 | 5 | 0.1% |
| 2 | 48 | 0.8% |
| 1.142857143 | 1 | < 0.1% |
| 1 | 2878 | 50.6% |
| 0.75 | 1 | < 0.1% |
| 0.6666666667 | 3 | 0.1% |
| 0.550802139 | 1 | < 0.1% |
| 0.5335120643 | 1 | < 0.1% |
avg_basket_size
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 2369 |
|---|---|
| Distinct (%) | 41.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 248.215404 |
| Minimum | 1 |
|---|---|
| Maximum | 14149 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 75 |
| median | 152 |
| Q3 | 290.625 |
| 95-th percentile | 732.55 |
| Maximum | 14149 |
| Range | 14148 |
| Interquartile range (IQR) | 215.625 |
Descriptive statistics
| Standard deviation | 439.4962599 |
|---|---|
| Coefficient of variation (CV) | 1.770624437 |
| Kurtosis | 378.6707145 |
| Mean | 248.215404 |
| Median Absolute Deviation (MAD) | 96.57142857 |
| Skewness | 14.57359157 |
| Sum | 1413090.295 |
| Variance | 193156.9625 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 114 | 2.0% |
| 2 | 72 | 1.3% |
| 3 | 51 | 0.9% |
| 4 | 49 | 0.9% |
| 5 | 35 | 0.6% |
| 6 | 29 | 0.5% |
| 12 | 26 | 0.5% |
| 72 | 22 | 0.4% |
| 100 | 22 | 0.4% |
| 73 | 21 | 0.4% |
| Other values (2359) | 5252 | 92.3% |
| Value | Count | Frequency (%) |
| 1 | 114 | 2.0% |
| 2 | 72 | 1.3% |
| 3 | 51 | 0.9% |
| 3.333333333 | 1 | < 0.1% |
| 4 | 49 | 0.9% |
| 5 | 35 | 0.6% |
| 5.333333333 | 1 | < 0.1% |
| 5.666666667 | 1 | < 0.1% |
| 6 | 29 | 0.5% |
| 6.142857143 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 14149 | 1 | < 0.1% |
| 13956 | 1 | < 0.1% |
| 7824 | 1 | < 0.1% |
| 6009.333333 | 1 | < 0.1% |
| 5963 | 1 | < 0.1% |
| 5197 | 1 | < 0.1% |
| 4300 | 1 | < 0.1% |
| 4282 | 1 | < 0.1% |
| 4280 | 1 | < 0.1% |
| 4136 | 1 | < 0.1% |
avg_unique_basked_size
Real number (ℝ≥0)
| Distinct | 1172 |
|---|---|
| Distinct (%) | 20.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 37.27433576 |
| Minimum | 0.2 |
|---|---|
| Maximum | 1109 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0.2 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 7.25 |
| median | 15 |
| Q3 | 31 |
| 95-th percentile | 173 |
| Maximum | 1109 |
| Range | 1108.8 |
| Interquartile range (IQR) | 23.75 |
Descriptive statistics
| Standard deviation | 76.89257379 |
|---|---|
| Coefficient of variation (CV) | 2.062882469 |
| Kurtosis | 32.87848226 |
| Mean | 37.27433576 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 5.072750237 |
| Sum | 212202.7935 |
| Variance | 5912.467904 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 276 | 4.8% |
| 2 | 161 | 2.8% |
| 3 | 114 | 2.0% |
| 9 | 105 | 1.8% |
| 10 | 105 | 1.8% |
| 8 | 103 | 1.8% |
| 5 | 102 | 1.8% |
| 7 | 101 | 1.8% |
| 6 | 101 | 1.8% |
| 13 | 97 | 1.7% |
| Other values (1162) | 4428 | 77.8% |
| Value | Count | Frequency (%) |
| 0.2 | 1 | < 0.1% |
| 0.25 | 3 | 0.1% |
| 0.3333333333 | 7 | 0.1% |
| 0.4 | 1 | < 0.1% |
| 0.4090909091 | 1 | < 0.1% |
| 0.5 | 12 | 0.2% |
| 0.5454545455 | 1 | < 0.1% |
| 0.5555555556 | 1 | < 0.1% |
| 0.5714285714 | 1 | < 0.1% |
| 0.6176470588 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1109 | 1 | < 0.1% |
| 748 | 1 | < 0.1% |
| 730 | 1 | < 0.1% |
| 720 | 1 | < 0.1% |
| 703 | 1 | < 0.1% |
| 686 | 1 | < 0.1% |
| 675 | 1 | < 0.1% |
| 673 | 1 | < 0.1% |
| 660 | 1 | < 0.1% |
| 649 | 1 | < 0.1% |
quantity_items_returned
Real number (ℝ≥0)
HIGH CORRELATION, SKEWED, ZEROS
| Distinct | 214 |
|---|---|
| Distinct (%) | 3.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 19.83769542 |
| Minimum | 0 |
|---|---|
| Maximum | 9360 |
| Zeros | 4190 |
| Zeros (%) | 73.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 38 |
| Maximum | 9360 |
| Range | 9360 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 239.433569 |
|---|---|
| Coefficient of variation (CV) | 12.06962623 |
| Kurtosis | 1016.856308 |
| Mean | 19.83769542 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 29.45080834 |
| Sum | 112936 |
| Variance | 57328.43395 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 4190 | 73.6% |
| 1 | 169 | 3.0% |
| 2 | 150 | 2.6% |
| 3 | 105 | 1.8% |
| 4 | 89 | 1.6% |
| 6 | 78 | 1.4% |
| 5 | 61 | 1.1% |
| 12 | 52 | 0.9% |
| 7 | 44 | 0.8% |
| 8 | 43 | 0.8% |
| Other values (204) | 712 | 12.5% |
| Value | Count | Frequency (%) |
| 0 | 4190 | 73.6% |
| 1 | 169 | 3.0% |
| 2 | 150 | 2.6% |
| 3 | 105 | 1.8% |
| 4 | 89 | 1.6% |
| 5 | 61 | 1.1% |
| 6 | 78 | 1.4% |
| 7 | 44 | 0.8% |
| 8 | 43 | 0.8% |
| 9 | 41 | 0.7% |
| Value | Count | Frequency (%) |
| 9360 | 1 | < 0.1% |
| 9014 | 1 | < 0.1% |
| 8004 | 1 | < 0.1% |
| 4427 | 1 | < 0.1% |
| 3768 | 1 | < 0.1% |
| 3332 | 1 | < 0.1% |
| 2878 | 1 | < 0.1% |
| 2022 | 1 | < 0.1% |
| 2012 | 1 | < 0.1% |
| 1776 | 1 | < 0.1% |
monetary_returned
Real number (ℝ≥0)
HIGH CORRELATION, SKEWED, ZEROS
| Distinct | 1085 |
|---|---|
| Distinct (%) | 19.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 39.49914808 |
| Minimum | 0 |
|---|---|
| Maximum | 22998.4 |
| Zeros | 4190 |
| Zeros (%) | 73.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 44.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 3.75 |
| 95-th percentile | 105.936 |
| Maximum | 22998.4 |
| Range | 22998.4 |
| Interquartile range (IQR) | 3.75 |
Descriptive statistics
| Standard deviation | 438.5905076 |
|---|---|
| Coefficient of variation (CV) | 11.10379664 |
| Kurtosis | 1590.03785 |
| Mean | 39.49914808 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 35.43664486 |
| Sum | 224868.65 |
| Variance | 192361.6334 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 4190 | 73.6% |
| 12.75 | 20 | 0.4% |
| 4.95 | 19 | 0.3% |
| 9.95 | 17 | 0.3% |
| 15 | 17 | 0.3% |
| 5.9 | 12 | 0.2% |
| 25.5 | 11 | 0.2% |
| 4.25 | 10 | 0.2% |
| 3.75 | 9 | 0.2% |
| 19.9 | 8 | 0.1% |
| Other values (1075) | 1380 | 24.2% |
| Value | Count | Frequency (%) |
| 0 | 4190 | 73.6% |
| 0.42 | 2 | < 0.1% |
| 0.65 | 1 | < 0.1% |
| 0.95 | 1 | < 0.1% |
| 1.25 | 4 | 0.1% |
| 1.45 | 4 | 0.1% |
| 1.64 | 1 | < 0.1% |
| 1.65 | 5 | 0.1% |
| 1.7 | 2 | < 0.1% |
| 1.79 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 22998.4 | 1 | < 0.1% |
| 14688.24 | 1 | < 0.1% |
| 8511.15 | 1 | < 0.1% |
| 7443.59 | 1 | < 0.1% |
| 5228.4 | 1 | < 0.1% |
| 4815.26 | 1 | < 0.1% |
| 4814.74 | 1 | < 0.1% |
| 4486.24 | 1 | < 0.1% |
| 4429 | 1 | < 0.1% |
| 3677.15 | 1 | < 0.1% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.
To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
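The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code pandas-profiling runs: the helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, ties are assumed to share the average of their rank positions, and the toy data is not taken from the report.

```python
from statistics import mean, pstdev


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Population covariance divided by the product of standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))


def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Toy data (an assumption for illustration): y = x**3 is monotonic, not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)  # close to 1: the ranks agree perfectly
r = pearson_r(x, y)       # below 1: the relationship is not a straight line
```

On this toy input ρ reaches its maximum while Pearson's r does not, which is the behaviour that makes rank correlation useful for heavily skewed columns.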
Pearson's r
Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the angle of the line to the x-axis does not affect r.
To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
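As a rough sketch of that definition, the snippet below computes r with the standard library only and checks the location and scale invariance. The function name `pearson_r` is hypothetical, the inputs are the `quantity_orders` and `revenue` values of the first five sample rows of this report, and the shift and scale constants are arbitrary assumptions.

```python
from statistics import mean, pstdev


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))


# First five sample rows of the report: quantity_orders and revenue.
quantity_orders = [34, 9, 15, 5, 3]
revenue = [5391.21, 3232.59, 6705.38, 948.25, 876.00]

r_original = pearson_r(quantity_orders, revenue)

# Shift and rescale each variable separately (positive scale factors only).
shifted_orders = [2 * v + 10 for v in quantity_orders]
rescaled_revenue = [0.01 * v - 5 for v in revenue]
r_transformed = pearson_r(shifted_orders, rescaled_revenue)

# Both values agree up to floating-point rounding.
print(r_original, r_transformed)
```

Five rows are far too few to say anything about the full dataset; the point is only that the two printed values coincide.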
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.
To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
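The pair-counting rule can be written out directly. The sketch below implements the simple variant usually called tau-a, exactly as worded above; library implementations typically apply an additional correction for ties, so their results may differ on data with many repeated values. The function name `kendall_tau_a` and the toy data are assumptions for illustration.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
        # direction == 0 means a tie in x or y; it counts toward neither
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Toy data: one swapped pair out of six gives 5 concordant and 1 discordant.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)  # (5 - 1) / 6
```

The double loop over pairs is quadratic in the number of observations, which is acceptable for a sketch but slow for thousands of rows.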
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.
First rows
| | customer_id | revenue | recency | quantity_orders | quantity_items_purchased | avg_ticket | avg_recency | time_in_base | frequency | frequency_btwn_purchases | avg_basket_size | avg_unique_basked_size | quantity_items_returned | monetary_returned |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 17850 | 5391.21 | 372 | 34 | 1733 | 158.565000 | 186.500000 | 374 | 0.090909 | 17.000000 | 50.970588 | 0.617647 | 40 | 102.58 |
| 1 | 13047 | 3232.59 | 56 | 9 | 1390 | 359.176667 | 53.285714 | 374 | 0.024064 | 0.028302 | 154.444444 | 11.666667 | 35 | 143.49 |
| 2 | 12583 | 6705.38 | 2 | 15 | 5028 | 447.025333 | 24.866667 | 374 | 0.040107 | 0.040323 | 335.200000 | 7.600000 | 50 | 76.04 |
| 3 | 13748 | 948.25 | 95 | 5 | 439 | 189.650000 | 93.250000 | 374 | 0.013369 | 0.017921 | 87.800000 | 4.800000 | 0 | 0.00 |
| 4 | 15100 | 876.00 | 333 | 3 | 80 | 292.000000 | 124.333333 | 374 | 0.008021 | 0.073171 | 26.666667 | 0.333333 | 22 | 240.90 |
| 5 | 15291 | 4623.30 | 25 | 14 | 2102 | 330.235714 | 26.642857 | 374 | 0.037433 | 0.040115 | 150.142857 | 4.357143 | 29 | 71.79 |
| 6 | 14688 | 5630.87 | 7 | 21 | 3621 | 268.136667 | 18.650000 | 374 | 0.056150 | 0.057221 | 172.428571 | 7.047619 | 399 | 523.49 |
| 7 | 17809 | 5411.91 | 16 | 12 | 2057 | 450.992500 | 37.300000 | 374 | 0.032086 | 0.033520 | 171.416667 | 3.833333 | 41 | 67.06 |
| 8 | 15311 | 60767.90 | 0 | 91 | 38194 | 667.779121 | 4.144444 | 374 | 0.243316 | 0.243316 | 419.714286 | 6.230769 | 474 | 1348.56 |
| 9 | 16098 | 2005.63 | 87 | 7 | 613 | 286.518571 | 53.285714 | 374 | 0.018717 | 0.024390 | 87.571429 | 4.857143 | 0 | 0.00 |
Last rows
| | customer_id | revenue | recency | quantity_orders | quantity_items_purchased | avg_ticket | avg_recency | time_in_base | frequency | frequency_btwn_purchases | avg_basket_size | avg_unique_basked_size | quantity_items_returned | monetary_returned |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5683 | 22695 | 6083.95 | 1 | 1 | 1852 | 6083.95 | 1.0 | 2 | 0.5 | 1.0 | 1852.0 | 675.0 | 0 | 0.0 |
| 5684 | 22696 | 7150.07 | 1 | 1 | 2150 | 7150.07 | 1.0 | 2 | 0.5 | 1.0 | 2150.0 | 748.0 | 0 | 0.0 |
| 5685 | 22699 | 3686.80 | 1 | 1 | 691 | 3686.80 | 1.0 | 2 | 0.5 | 1.0 | 691.0 | 203.0 | 0 | 0.0 |
| 5686 | 22700 | 4839.42 | 1 | 1 | 1074 | 4839.42 | 1.0 | 2 | 0.5 | 1.0 | 1074.0 | 55.0 | 0 | 0.0 |
| 5687 | 22704 | 17.90 | 1 | 1 | 14 | 17.90 | 1.0 | 2 | 0.5 | 1.0 | 14.0 | 7.0 | 0 | 0.0 |
| 5688 | 22705 | 3.35 | 1 | 1 | 2 | 3.35 | 1.0 | 2 | 0.5 | 1.0 | 2.0 | 2.0 | 0 | 0.0 |
| 5689 | 22706 | 5699.00 | 1 | 1 | 1747 | 5699.00 | 1.0 | 2 | 0.5 | 1.0 | 1747.0 | 634.0 | 0 | 0.0 |
| 5690 | 22707 | 6756.06 | 0 | 1 | 2010 | 6756.06 | 0.0 | 1 | 1.0 | 1.0 | 2010.0 | 730.0 | 0 | 0.0 |
| 5691 | 22708 | 3217.20 | 0 | 1 | 654 | 3217.20 | 0.0 | 1 | 1.0 | 1.0 | 654.0 | 56.0 | 0 | 0.0 |
| 5692 | 22709 | 3950.72 | 0 | 1 | 731 | 3950.72 | 0.0 | 1 | 1.0 | 1.0 | 731.0 | 217.0 | 0 | 0.0 |